Processing Spontaneous Orthography
Abstract
In cases in which there is no standard orthography for a language or language variant, written texts will display a variety of orthographic choices. This is problematic for natural language processing (NLP) because it creates spurious data sparseness. We study the transformation of spontaneously spelled Egyptian Arabic into a conventionalized orthography which we have previously proposed for NLP purposes. We show that a two-stage process can reduce divergences from this standard by 69%, making subsequent processing of Egyptian Arabic easier.
Similar resources
Implicit Priming Effects in Chinese Word Recall: the Role of Orthography and Tones in the Mental Lexicon (Computational Linguistics & Chinese Language Processing)
This paper explores the relative contributions made by orthography, syllabic segment, and lexical tone in the word recognition and retrieval process. It also challenges recent assumptions regarding the role of orthography and tones in mental lexicon architecture. Using an implicit priming paradigm, a word recognition experiment was conducted with native speakers of two tonal languages, Chinese ...
Paying attention to orthography: a visual evoked potential study
In adult readers, letters and words are rapidly identified within visual networks to allow for efficient reading abilities. Neuroimaging studies of orthography have mostly used words and letter strings that recruit many hierarchical levels in reading. Understanding how single letters are processed could provide further insight into orthographic processing. The present study investigated orthog...
The Effect of L1 Persian on the Acquisition of English L2 Orthographic System on the Shared Grounds
This paper elaborates on Persian and English orthographic shared aspects to study the effects of L1 Persian on learning English as a foreign language. While there are some examples of letter and sound mismatches in the orthographic system of both languages, those of English are more complex than Persian. In order to see the effect of the mismatch between orthography and transcription, 40 Persia...
Metalinguistic awareness and reading performance: a cross language comparison.
The study examined two questions: (1) do the greater phonological awareness skills of bilinguals affect reading performance; (2) to what extent do the orthographic characteristics of a language influence reading performance and how does this interact with the effects of phonological awareness. We estimated phonological metalinguistic abilities and reading measures in three groups of first grad...
Learning to spell in a language with transparent orthography: Distributional properties of orthography and whole-word lexical processing.
We examined how whole-word lexical information and knowledge of distributional properties of orthography interact in children's spelling. High- versus low-frequency words, which included inconsistently spelled segments occurring more or less frequently in the orthography, were used in two experiments: (a) word spelling; (b) lexical priming of pseudoword spelling. Participants were 1st-, 2nd-, a...